Search for: All records

Creators/Authors contains: "Gawron, Rebecca"


  1. Eukaryotic diversity is largely microbial, with macroscopic lineages (plants, animals and fungi) nesting among a plethora of diverse protists. Understanding of the evolutionary relationships among eukaryotes is rapidly advancing through omics analyses, but phylogenomic analyses are challenging for microeukaryotes, particularly uncultivable lineages, because single-cell sequencing approaches generate a mixture of sequences from hosts, associated microbiomes, and contaminants. Moreover, many analyses of eukaryotic gene families and phylogenies rely on boutique datasets and methods that are challenging for other research groups to replicate. To address these challenges, we present EukPhylo v1.0, a modular, user-friendly pipeline that enables effective data curation through phylogeny-informed contamination removal, estimation of homologous gene families (GFs), and generation of both multisequence alignments and gene trees. Analyses can use a hook database of ~15k ancient GFs, or users can easily replace this hook with a set of gene families of interest. We demonstrate the power of EukPhylo, including a suite of stand-alone utilities, through analyses of 500 conserved GFs sampled from 1,000 diverse species of eukaryotes, bacteria and archaea. We show improvements in estimates of the eukaryotic tree of life, recovering clades that are well established in the literature, through successive rounds of curation using the EukPhylo contamination loop. The final trees corroborate numerous hypotheses in the literature (e.g., Opisthokonta, Rhizaria, Amoebozoa) while challenging others (e.g., CRuMs, Obazoa, Diaphoretickes). We believe that the flexibility and transparency of EukPhylo set standards for curation of omics data for future studies.
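The abstract does not say how the phylogeny-informed contamination removal works. The following is a minimal sketch of one plausible rule, not EukPhylo's actual code: a sequence is flagged when its sister group in a gene tree contains at least two sequences that all belong to a single, different major clade. The nested-tuple tree encoding, the names `leaves`, `flag_suspects`, `example_tree` and `example_clades`, and the `min_sisters` threshold are all assumptions made for illustration.

```python
def leaves(node):
    """Return the leaf names under a node (a leaf is a plain string)."""
    if isinstance(node, str):
        return [node]
    return [name for child in node for name in leaves(child)]


def flag_suspects(tree, clade_of, min_sisters=2):
    """Flag leaves whose sister group belongs entirely to another clade.

    tree        -- hypothetical encoding: nested tuples of leaf names
    clade_of    -- dict mapping each leaf name to its expected major clade
    min_sisters -- smallest sister group accepted as evidence (assumed value)
    """
    suspects = []
    if isinstance(tree, str):
        return suspects
    for i, child in enumerate(tree):
        if isinstance(child, str):
            # Every leaf under the sibling subtrees of this leaf.
            sisters = [name
                       for j, sibling in enumerate(tree) if j != i
                       for name in leaves(sibling)]
            sister_clades = {clade_of[name] for name in sisters}
            if (len(sisters) >= min_sisters
                    and len(sister_clades) == 1
                    and clade_of[child] not in sister_clades):
                suspects.append(child)
        else:
            # Internal node: search for suspects deeper in the tree.
            suspects.extend(flag_suspects(child, clade_of, min_sisters))
    return suspects


# Toy gene tree: one amoebozoan sequence sits sister to a pair of bacteria.
example_tree = ((("amoeba_1", ("bact_1", "bact_2")), "bact_3"),
                ("animal_1", "fungus_1"))
example_clades = {
    "amoeba_1": "Amoebozoa",
    "bact_1": "Bacteria",
    "bact_2": "Bacteria",
    "bact_3": "Bacteria",
    "animal_1": "Opisthokonta",
    "fungus_1": "Opisthokonta",
}

print(flag_suspects(example_tree, example_clades))  # ['amoeba_1']
```

Requiring at least two sister sequences avoids flagging both members of a mixed two-leaf pair, where the tree alone cannot tell which sequence is misplaced. A single pass of such a rule corresponds only loosely to one round of the successive curation described in the abstract.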